AITopics | top feature

Collaborating Authors

top feature

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CaliciBoost: Performance-Driven Evaluation of Molecular Representations for Caco-2 Permeability Prediction

Van Le, Huong, Ren, Weibin, Kim, Junhong, Yun, Yukyung, Park, Young Bin, Kim, Young Jun, Han, Bok Kyung, Choi, Inho, Park, Jong IL, Yun, Hwi-Yeol, Choi, Jae-Mun

arXiv.org Artificial IntelligenceJun-11-2025

ABSTRACT Caco-2 permeability serves as a critical in vitro indicator to predict oral absorption of drug candidates during early-stage drug discovery. To improve the precision and efficiency of computational predictions, we systematically investigated the impact of eight types of molecular feature representation including 2D / 3D descriptors, structural fingerprints and deep learning-based embeddings combined with automated machine learning techniques to predict Caco-2 permeability. Using two datasets of differing scale and diversity (TDC benchmark and curated OCHEM data), we assessed model performance across representations and identified PaDEL, Mordred, and RDKit descriptors as particularly effective for Caco-2 prediction. Notably, the AutoML-based model CaliciBoost achieved the best MAE performance. Furthermore, for both PaDEL and Mordred representations, the incorporation of 3D descriptors resulted in a 15.73% reduction in MAE compared to using 2D features alone, as confirmed by feature importance analysis. These findings highlight the effectiveness of AutoML approaches in ADMET modeling and offer practical guidance for feature selection in data-limited prediction tasks. INTRODUCTION Caco-2 cell permeability is a widely used in vitro proxy for assessing the intestinal absorption of drug candidates in early-stage drug discovery.

artificial intelligence, descriptor, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.08059

Country: Asia > South Korea > Gyeongsangbuk-do (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Re-Visiting Explainable AI Evaluation Metrics to Identify The Most Informative Features

Salih, Ahmed M.

arXiv.org Machine LearningJan-31-2025

Functionality or proxy-based approach is one of the used approaches to evaluate the quality of explainable artificial intelligence methods. It uses statistical methods, definitions and new developed metrics for the evaluation without human intervention. Among them, Selectivity or RemOve And Retrain (ROAR), and Permutation Importance (PI) are the most commonly used metrics to evaluate the quality of explainable artificial intelligence methods to highlight the most significant features in machine learning models. They state that the model performance should experience a sharp reduction if the most informative feature is removed from the model or permuted. However, the efficiency of both metrics is significantly affected by multicollinearity, number of significant features in the model and the accuracy of the model. This paper shows with empirical examples that both metrics suffer from the aforementioned limitations. Accordingly, we propose expected accuracy interval (EAI), a metric to predict the upper and lower bounds of the the accuracy of the model when ROAR or IP is implemented. The proposed metric found to be very useful especially with collinear features.

accuracy, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

2502.00088

Country:

Europe > United Kingdom > England > Leicestershire > Leicester (0.05)
North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Iraq > Kurdistan Region > Duhok Governorate > Zakho (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.92)

Add feedback

XAI-based Feature Ensemble for Enhanced Anomaly Detection in Autonomous Driving Systems

Nazat, Sazid, Abdallah, Mustafa

arXiv.org Artificial IntelligenceOct-20-2024

The rapid advancement of autonomous vehicle (AV) technology has introduced significant challenges in ensuring transportation security and reliability. Traditional AI models for anomaly detection in AVs are often opaque, posing difficulties in understanding and trusting their decision making processes. This paper proposes a novel feature ensemble framework that integrates multiple Explainable AI (XAI) methods: SHAP, LIME, and DALEX with various AI models to enhance both anomaly detection and interpretability. By fusing top features identified by these XAI methods across six diverse AI models (Decision Trees, Random Forests, Deep Neural Networks, K Nearest Neighbors, Support Vector Machines, and AdaBoost), the framework creates a robust and comprehensive set of features critical for detecting anomalies. These feature sets, produced by our feature ensemble framework, are evaluated using independent classifiers (CatBoost, Logistic Regression, and LightGBM) to ensure unbiased performance. We evaluated our feature ensemble approach on two popular autonomous driving datasets (VeReMi and Sensor) datasets. Our feature ensemble technique demonstrates improved accuracy, robustness, and transparency of AI models, contributing to safer and more trustworthy autonomous driving systems.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.15405

Country:

North America > United States > Indiana > Marion County > Indianapolis (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
Asia > India (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Security & Privacy (1.00)
Automobiles & Trucks (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection

Subasi, Omer, Cree, Johnathan, Manzano, Joseph, Peterson, Elena

arXiv.org Artificial IntelligenceJul-4-2024

There has been a large number of studies in interpretable and explainable ML for cybersecurity, in particular, for intrusion detection. Many of these studies have significant amount of overlapping and repeated evaluations and analysis. At the same time, these studies overlook crucial model, data, learning process, and utility related issues and many times completely disregard them. These issues include the use of overly complex and opaque ML models, unaccounted data imbalances and correlated features, inconsistent influential features across different explanation methods, the inconsistencies stemming from the constituents of a learning process, and the implausible utility of explanations. In this work, we empirically demonstrate these issues, analyze them and propose practical solutions in the context of feature-based model explanations. Specifically, we advise avoiding complex opaque models such as Deep Neural Networks and instead using interpretable ML models such as Decision Trees as the available intrusion datasets are not difficult for such interpretable models to classify successfully. Then, we bring attention to the binary classification metrics such as Matthews Correlation Coefficient (which are well-suited for imbalanced datasets. Moreover, we find that feature-based model explanations are most often inconsistent across different settings. In this respect, to further gauge the extent of inconsistencies, we introduce the notion of cross explanations which corroborates that the features that are determined to be impactful by one explanation method most often differ from those by another method. Furthermore, we show that strongly correlated data features and the constituents of a learning process, such as hyper-parameters and the optimization routine, become yet another source of inconsistent explanations. Finally, we discuss the utility of feature-based explanations.

dataset, explanation, interpretable ml, (16 more...)

arXiv.org Artificial Intelligence

2407.04009

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

User Identification via Free Roaming Eye Tracking Data

Haria, Rishabh Vallabh Varsha, Abed, Amin El, Maneth, Sebastian

arXiv.org Artificial IntelligenceMar-14-2024

We present a new dataset of "free roaming" (FR) and "targeted roaming" (TR): a pool of 41 participants is asked to walk around a university campus (FR) or is asked to find a particular room within a library (TR). Eye movements are recorded using a commodity wearable eye tracker (Pupil Labs Neon at 200Hz). On this dataset we investigate the accuracy of user identification using a previously known machine learning pipeline where a Radial Basis Function Network (RBFN) is used as classifier. Our highest accuracies are 87.3% for FR and 89.4% for TR. This should be compared to 95.3% which is the (corresponding) highest accuracy we are aware of (achieved in a laboratory setting using the "RAN" stimulus of the BioEye 2015 competition dataset). To the best of our knowledge, our results are the first that study user identification in a non laboratory setting; such settings are often more feasible than laboratory settings and may include further advantages. The minimum duration of each recording is 263s for FR and 154s for TR. Our best accuracies are obtained when restricting to 120s and 140s for FR and TR respectively, always cut from the end of the trajectories (both for the training and testing sessions). If we cut the same length from the beginning, then accuracies are 12.2% lower for FR and around 6.4% lower for TR. On the full trajectories accuracies are lower by 5% and 52% for FR and TR. We also investigate the impact of including higher order velocity derivatives (such as acceleration, jerk, or jounce).

accuracy, dataset, tr dataset, (13 more...)

arXiv.org Artificial Intelligence

2403.09415

Country:

Europe > Germany > Bremen > Bremen (0.29)
Europe > Latvia > Riga Municipality > Riga (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)

Add feedback

An Experiment on Feature Selection using Logistic Regression

Islam, Raisa, Mazumdar, Subhasish, Islam, Rakibul

arXiv.org Artificial IntelligenceJan-31-2024

In supervised machine learning, feature selection plays a very important role by potentially enhancing explainability and performance as measured by computing time and accuracy-related metrics. In this paper, we investigate a method for feature selection based on the well-known L1 and L2 regularization strategies associated with logistic regression (LR). It is well known that the learned coefficients, which serve as weights, can be used to rank the features. Our approach is to synthesize the findings of L1 and L2 regularization. For our experiment, we chose the CIC-IDS2018 dataset owing partly to its size and also to the existence of two problematic classes that are hard to separate. We report first with the exclusion of one of them and then with its inclusion. We ranked features first with L1 and then with L2, and then compared logistic regression with L1 (LR+L1) against that with L2 (LR+L2) by varying the sizes of the feature sets for each of the two rankings. We found no significant difference in accuracy between the two methods once the feature set is selected. We chose a synthesis, i.e., only those features that were present in both the sets obtained from L1 and that from L2, and experimented with it on more complex models like Decision Tree and Random Forest and observed that the accuracy was very close in spite of the small size of the feature set. Additionally, we also report on the standard metrics: accuracy, precision, recall, and f1-score.

accuracy, experiment, regularization, (15 more...)

arXiv.org Artificial Intelligence

2402.00201

Country: North America > United States > New Mexico > Socorro County > Socorro (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.57)

Add feedback

Study Features via Exploring Distribution Structure

Cao, Chunxu, Zhang, Qiang

arXiv.org Artificial IntelligenceJan-15-2024

In this paper, we present a novel framework for data redundancy measurement based on probabilistic modeling of datasets, and a new criterion for redundancy detection that is resilient to noise. We also develop new methods for data redundancy reduction using both deterministic and stochastic optimization techniques. Our framework is flexible and can handle different types of features, and our experiments on benchmark datasets demonstrate the effectiveness of our methods. We provide a new perspective on feature selection, and propose effective and robust approaches for both supervised and unsupervised learning problems.

dataset, feature selection, redundancy, (15 more...)

arXiv.org Artificial Intelligence

2401.0754

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Oceania > New Zealand > North Island > Waikato (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Add feedback

LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond The True Class

Zhu, Hongbo, Cangelosi, Angelo, Sen, Procheta, Mukherjee, Anirbit

arXiv.org Artificial IntelligenceDec-7-2023

In this work, we instantiate a novel perturbation-based multi-class explanation framework, LIPEx (Locally Interpretable Probabilistic Explanation). We demonstrate that LIPEx not only locally replicates the probability distributions output by the widely used complex classification models but also provides insight into how every feature deemed to be important affects the prediction probability for each of the possible classes. We achieve this by defining the explanation as a matrix obtained via regression with respect to the Hellinger distance in the space of probability distributions. Ablation tests on text and image data, show that LIPEx-guided removal of important features from the data causes more change in predictions for the underlying model than similar tests based on other saliency-based or feature importance-based Explainable AI (XAI) methods. It is also shown that compared to LIME, LIPEx is more data efficient in terms of using a lesser number of perturbations of the data to obtain a reliable explanation. This data-efficiency is seen to manifest as LIPEx being able to compute its explanation matrix around 53% faster than all-class LIME, for classification experiments with text data.

explanation, lipex, perturbation, (17 more...)

arXiv.org Artificial Intelligence

2310.04856

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Law (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.87)

Add feedback

Generating Bayesian Network Models from Data Using Tsetlin Machines

Blakely, Christian D.

arXiv.org Artificial IntelligenceMay-17-2023

Bayesian networks (BN) are directed acyclic graphical (DAG) models that have been adopted into many fields for their strengths in transparency, interpretability, probabilistic reasoning, and causal modeling. Given a set of data, one hurdle towards using BNs is in building the network graph from the data that properly handles dependencies, whether correlated or causal. In this paper, we propose an initial methodology for discovering network structures using Tsetlin Machines.

artificial intelligence, machine learning, node, (20 more...)

arXiv.org Artificial Intelligence

2305.10538

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

best-ai-powered-photo-organizers

#artificialintelligenceMar-5-2023, 19:48:50 GMT

As we continue to accumulate digital photos on our devices, it can be challenging to keep them organized and easy to find. But artificial intelligence (AI) has made things easier by enabling a wide range of intelligent organization features. AI-powered photo organizers use machine learning algorithms to automatically tag, sort, and categorize photos based on their content, date, location, and other factors. These intelligent tools are becoming essential in the digital age, allowing us to quickly locate specific photos and share them with ease. One of the best AI-powered photo organizers on the market is PhotoPrism, an app that helps users manage and organize their digital photo collection more efficiently and effectively.

automatically tag, criteria, top feature, (16 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback